Abstract: Neural recordings, returns from radars and sonars, and
images in astronomy and single-molecule microscopy can be
modeled as a linear superposition of a small number of scaled
and delayed copies of a band-limited or diffraction-limited point
spread function, which is either determined by nature or
designed by the users; in other words, we observe the convolution
between a point spread function and a sparse spike signal with
unknown amplitudes and delays. While it is of great interest
to accurately resolve the spike signal from as few samples
as possible, the problem is severely ill-posed when the point
spread function is not known a priori. This paper
proposes a convex optimization framework to simultaneously
estimate the point spread function and the spike signal,
by mildly constraining the point spread function to lie in a
known low-dimensional subspace. By applying the lifting trick,
we obtain an underdetermined linear system of an ensemble
of signals with joint spectral sparsity, to which atomic norm
minimization is applied. Under mild randomness assumptions on
the low-dimensional subspace as well as a separation condition
on the spike signal, we prove that the proposed algorithm, dubbed
AtomicLift, is guaranteed to recover the spike signal up to a
scaling factor as soon as the number of samples is large enough.
The extension of AtomicLift to handle noisy measurements is
also discussed. Numerical examples are provided to validate the
effectiveness of the proposed approaches.

Index Terms — blind spikes deconvolution, lifting, atomic norm, 
joint spectral sparsity 

I. Introduction

In many applications, the goal is to estimate the
delays and amplitudes of the point sources contained in a sparse
spike signal $x(t)$ from its convolution with a band-limited or
diffraction-limited point spread function (PSF) $g(t)$, which is
either determined by nature or designed by the users. This
describes the problem of estimating target locations in radar
and sonar, firing times of neurons, directions of arrival in array
signal processing, etc.

When the PSF is assumed perfectly known, many algorithms
have been developed to retrieve the spike signal, ranging
from subspace methods such as MUSIC [1] and ESPRIT
[2] to total variation minimization [3]. However, in many
applications, the PSF is not known a priori and must be estimated
together with the spike model, a problem referred to as blind spikes
deconvolution. As an example, in neural spike train decoding,
the characteristic function of the neurons is determined by
nature and needs to be calibrated [4]. As another example,
in blind channel estimation for wireless communications, the
transmitted signal is modulated by unknown data symbols, so
the receiver has to perform joint channel estimation
and symbol decoding. A related problem is blind calibration
of uniform linear arrays [5], where it is desirable to calibrate
the gains of the array antennas in a blind fashion.

Broadly speaking, blind deconvolution of two signals from
their convolution falls into the category of bilinear inverse
problems, which are in general ill-posed without further
constraints. The identifiability of these problems, up to an
unavoidable scaling ambiguity, has recently been investigated in
[6]-[8] under the constraints that one or both signals are sparse
or lie in some known subspace. Along the algorithmic line
of research, conventional approaches for blind deconvolution
are typically based on expectation maximization [5], [9],
which often suffer from local minima and lack performance
guarantees. Recently, Ahmed, Recht and Romberg developed a
provably correct algorithm for blind deconvolution by assuming
that both signals lie in some known low-dimensional subspaces
[10] under certain conditions. The key to their approach is
the so-called lifting trick, which translates the problem into
an under-determined linear system with respect to a lifted
rank-one matrix, which can be exactly recovered using a
nuclear norm minimization algorithm. Ling and Strohmer [11]
further extended this framework by allowing one of the signals
to be sparse in a known dictionary, and applied $\ell_1$-norm
minimization to the lifted sparse matrix, for which sufficient
conditions for exact recovery are also provided. Finally, Lee
et al. proposed an alternating minimization framework [12] for
the case when both signals are sparse in a known dictionary,
and established convergence guarantees from a near-optimal
number of samples under some conditions.

A. Our Contributions 

In this paper, we study the problem of blind spikes
deconvolution, where we want to jointly estimate the PSF and
the spike signal composed of a small number of delayed
and scaled Dirac functions. Since it is more convenient to
work in the Fourier domain, we start by sampling the Fourier
transform of the convolution, giving rise to a measurement
vector $y = g \odot x + w \in \mathbb{C}^N$, where $\odot$ denotes the
point-wise product, $g \in \mathbb{C}^N$ is the sampled Fourier transform
of the PSF, $x \in \mathbb{C}^N$ is the sampled Fourier transform of the
spike signal, which is a sum of $K$ complex sinusoids with frequencies
determined by the corresponding delays, $K$ is the number of
spikes, and $w \in \mathbb{C}^N$ is an additive noise term. Our problem is
to recover the set of spikes contained in $x$ from the possibly
noisy observation $y$.

Motivated by [10], we assume that the PSF $g$ lies in a
known low-dimensional subspace $B \in \mathbb{C}^{N \times L}$, i.e. $g = Bh$,
$h \in \mathbb{C}^L$, where the orientation of $g$ in the subspace, given
by $h$, still needs to be estimated. This assumption is quite
flexible and holds, at least approximately, in a sizable number
of applications [10]. Through a novel application of the lifting
trick, we show how it is possible to translate the measurement
vector into a set of linear measurements with respect to the
matrix $Z^\star = x h^T \in \mathbb{C}^{N \times L}$. While it is tempting to directly
recover $Z^\star$ from the obtained linear system of equations, the system is
under-determined since we have more unknowns, $NL$, than the
number of observations, $N$. Fortunately, the columns
of $Z^\star$ can be regarded as an ensemble of spectrally-sparse
signals with the same spectral support, so it is possible
to promote this structure in the solution using the recently
proposed atomic norm for spectrally-sparse ensembles [13],
[14]. Specifically, we seek the matrix with minimum atomic
norm that satisfies the set of linear measurements exactly in
the noiseless setting, and within the noise level in the noisy
setting. The proposed algorithm is referred to as AtomicLift.
AtomicLift can be efficiently implemented via semidefinite
programming using off-the-shelf solvers. Moreover, the spikes
can be localized by identifying the peaks of a dual polynomial
constructed from the dual solution of AtomicLift. Numerical
examples are provided to demonstrate the effectiveness of
AtomicLift in both noiseless and noisy settings.

To establish rigorous performance guarantees of AtomicLift
in the noiseless case, we assume that each row of $B$ is
identically and independently drawn from a distribution that
obeys a simple isotropy property and an incoherence property,
which is motivated by Candes and Plan in their development
of a RIPless theory of compressed sensing [15]. This implies
that the PSF has a certain "spectral-flatness" property, so that the
PSF has on average the same energy at different frequencies.
Moreover, this assumption is flexible enough to allow the entries in
each row of $B$ to be correlated. On the other hand, we assume
the minimum separation between spikes is at least $1/M$, where
$N = 4M + 1$. This condition is the same as the requirement in
[3], [16] even when the PSF is known perfectly. Under these
conditions, we show that in the noiseless setting, with high
probability, AtomicLift recovers the spike signal model up to
a scaling factor as soon as $N$ is on the order of $O(K^2 L^2)$
up to logarithmic factors. Importantly, our result does not
make randomness assumptions on the spike signal $x$ nor on the
orientation $h$ of the PSF in the subspace. Recall that when
the PSF is known exactly, it is possible to resolve $K$ spikes
as soon as $N$ is on the order of $O(K)$. Therefore, when both
$K$ and $L$ are not too large, AtomicLift is provably capable
of blind spikes deconvolution at the price of more samples. The
stability analysis of AtomicLift in the noisy setting is presented
elsewhere [17] due to space limits.

Our proof is based on constructing a valid vector-valued
dual polynomial that certifies the optimality of the proposed
convex optimization algorithm with high probability. The
construction is inspired by [3], [16], where the squared Fejér's
kernel is an essential building block in the construction.
Nonetheless, significant and nontrivial modifications are
necessary since our dual polynomial is vector-valued rather than
scalar-valued as in the existing works, and is additionally
complicated by the special linear operator induced from lifting.

B. Comparisons with Related Work 

Our approach is inspired by the pioneering work of [10],
[11], [18], which applied the lifting trick to quadratic and
bilinear inverse problems such as phase retrieval and blind
deconvolution. In [10], both the PSF $g$ and the signal $x$ are
assumed to lie in some known subspaces with dimensions $L$
and $K$ respectively. It is established in [10] that a nuclear norm
minimization algorithm achieves exact recovery from a
near-optimal number of samples $N \ge O(K + L)$ up to logarithmic
factors, as long as the subspace of $g$ is deterministic and
satisfies a certain spectral-flatness condition, and the subspace
of $x$ is composed of i.i.d. Gaussian entries. Unfortunately,
this algorithm cannot be applied in our setting, as the signal
$x$ does not lie in a known subspace, but rather in an unknown
subspace parameterized by the continuous-valued locations of
the spikes.

In [11], Ling and Strohmer extended the framework in [10]
to allow the signal $x$ to be a $K$-sparse vector in a random
Gaussian or random partial DFT matrix. It is established in
[11] that an $\ell_1$-minimization algorithm achieves exact recovery
as soon as $N$ is on the order of $O(KL)$ up to logarithmic
factors. If the locations of the spikes in $x$ lie on the grid of the
DFT frame, $x$ becomes a sparse vector in the DFT frame, and it is
possible to apply the algorithm proposed in [11]. However, the
performance guarantees in [11] cannot be applied. Moreover,
since the locations of the spikes do not necessarily lie on
any a priori defined grid, the algorithm will encounter the basis
mismatch issue discussed extensively in [19], which potentially
results in significant performance degradation. The same holds true for
the algorithm of Lee et al. [12], which assumes $x$ is sparse in
a pre-determined dictionary.

Finally, our work is related to recent advances in
super-resolution [3], [16], [20]-[23] using total variation or atomic
norm minimization, but significantly deviates from the existing
literature since we focus on the more challenging case when
the PSF is not known. To the best of the author's knowledge,
our work provides the first algorithm for blind super-resolution
with provable performance guarantees.

C. Paper Organization 

The rest of the paper is organized as follows. Section II
formulates the problem of blind spikes deconvolution, and
describes the proposed AtomicLift algorithm and its performance
guarantees. Section III provides numerical examples to
demonstrate the performance of AtomicLift. Section IV proves the
main theorem in this paper. Finally, we conclude and outline
a few future directions in Section V.

Throughout the paper, we use boldface letters to
denote matrices (e.g. $A$) and vectors (e.g. $a$), $(\cdot)^T$ to denote the transpose,
$(\cdot)^H$ to denote the conjugate transpose, and $(\cdot)^*$ to denote the
conjugate. $\|A\|_F$ and $\|A\|$ denote the Frobenius norm and the
spectral norm of a matrix $A$ respectively, and $\|a\|_2$ denotes
the $\ell_2$ norm of a vector $a$.

II. AtomicLift for Blind Sparse Spikes
Deconvolution

We will first describe the problem formulation and the
AtomicLift algorithm in the noiseless case, and then extend
them to the noisy case.


A. Problem Formulation

Let $x(t)$ be a continuous-time spike signal given as

$$x(t) = \sum_{k=1}^{K} \tilde{a}_k \, \delta(t - \tilde{\tau}_k),$$

where $K$ is the number of spikes, $\tilde{a}_k \in \mathbb{C}$ and $\tilde{\tau}_k \in [0, T_{\max})$
are the complex amplitude and delay of the $k$th spike, $1 \le k \le K$,
and $T_{\max}$ is the maximum allowable delay of the spikes.
Let $g(t)$ be the PSF with the bandwidth $[-B_{\max}, B_{\max}]$. The
convolution of $g(t)$ and $x(t)$ is given as

$$y(t) = x(t) * g(t) = \sum_{k=1}^{K} \tilde{a}_k \, g(t - \tilde{\tau}_k), \tag{1}$$

where $*$ denotes convolution. Taking the Fourier transform of
(1), we have

$$Y(f) = X(f) G(f) = \left( \sum_{k=1}^{K} \tilde{a}_k e^{-j 2\pi f \tilde{\tau}_k} \right) G(f) \tag{2}$$

for $f \in [-B_{\max}, B_{\max}]$, where $X(f)$, $G(f)$, and $Y(f)$ are
the Fourier transforms of $x(t)$, $g(t)$, and $y(t)$ respectively. In
order to digitally process the output, we uniformly sample
(2) at $N = 4M + 1$ points $f_n = \frac{B_{\max} n}{2M}$, $n = -2M, \ldots, 2M$,
and denote $x_n = X(f_n)$, $g_n = G(f_n)$, and $y_n = Y(f_n)$. This
yields

$$y_n = \left( \sum_{k=1}^{K} \tilde{a}_k e^{-j 2\pi n \frac{B_{\max} T_{\max}}{2M} \tau_k} \right) \cdot g_n, \tag{3}$$

where $\tau_k = \tilde{\tau}_k / T_{\max} \in [0,1)$ is the normalized delay.
From (3), it is straightforward to see that the number of
samples needs to satisfy $2M \ge B_{\max} T_{\max}$ so that the delays
$\tau_k \in [0,1)$ can be uniquely identified. Since we are interested
in algorithmic frameworks that allow identification of the
spike signal using as small an $M$ as possible, without loss of
generality, we will assume $2M = B_{\max} T_{\max}$, and consider
the normalized delays $\tau_k \in [0,1)$ in this paper, so that the sample
complexity $2M$ is proportional to the bandwidth of $g(t)$.

We can now rewrite (3) as

$$y_n = g_n \cdot x_n = g_n \cdot \left( \sum_{k=1}^{K} \tilde{a}_k e^{-j 2\pi n \tau_k} \right), \tag{4}$$

where $n = -2M, \ldots, 2M$.

Denote the set of spike locations as $\mathcal{T} = \{\tau_k\}_{k=1}^{K}$. In
matrix form, we rewrite (4) as

$$y = \mathrm{diag}(g)\, x, \tag{5}$$

where $y = [y_{-2M}, \ldots, y_{2M}]^T$, $x = [x_{-2M}, \ldots, x_{2M}]^T$, and
$g = [g_{-2M}, \ldots, g_{2M}]^T$. Interestingly, (5) is also related to
blind calibration for uniform linear arrays, where $g$ can be
interpreted as the vector of array antenna gains.

Clearly, there is an unavoidable scaling ambiguity for the
identification of $g$ and $x$ from $y$, since for any nonzero scalar
$\beta$, $y = \mathrm{diag}(g) x = \mathrm{diag}(\beta g)(\beta^{-1} x)$. Our goal is to recover
both $g$ and $x$, in particular, the set of spikes $\mathcal{T}$ with their
corresponding amplitudes up to a scaling factor.
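As a sanity check on the sampled model, the following NumPy sketch builds $x$ from a few normalized delays, forms $y = \mathrm{diag}(g)x$ as in (5), and confirms the scaling ambiguity numerically. The values of `M`, `tau`, `amp`, `beta`, and the random `g` are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 4, 3                       # illustrative sizes (assumed)
N = 4 * M + 1                     # number of frequency samples
n = np.arange(-2 * M, 2 * M + 1)  # sample index n = -2M, ..., 2M

tau = np.array([0.10, 0.45, 0.80])                      # normalized delays in [0, 1)
amp = rng.standard_normal(K) + 1j * rng.standard_normal(K)

# x_n = sum_k amp_k * exp(-j 2 pi n tau_k), as in (4)
x = np.exp(-2j * np.pi * np.outer(n, tau)) @ amp

# Sampled PSF spectrum; random here purely for illustration
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)

y = g * x                         # y = diag(g) x, as in (5)

# Scaling ambiguity: (beta * g, x / beta) explains the same data
beta = 2.0 - 0.5j
y_scaled = (beta * g) * (x / beta)
```

Because `y` and `y_scaled` agree, no algorithm can distinguish the pair $(g, x)$ from $(\beta g, \beta^{-1} x)$ using $y$ alone.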


B. AtomicLift 

The problem of blind spikes deconvolution is extremely
ill-posed without further constraints [6]. In this paper, inspired
by [10], we make the assumption that $g$ lies in a known
low-dimensional subspace, given as

$$g = Bh,$$

where $B \in \mathbb{C}^{N \times L}$ is known, $h \in \mathbb{C}^{L}$ is unknown, and $L \ll N$.
Denote $B^T = [b_{-2M}, \ldots, b_{2M}]$, where $b_n \in \mathbb{C}^{L}$ is the
$n$th column of the matrix $B^T$. We rewrite $y_n$ in (4) as

$$y_n = b_n^T h \, x_n = b_n^T h \, e_n^T x = e_n^T (x h^T) b_n, \tag{6}$$

where $e_n$ is the $n$th standard basis vector of $\mathbb{R}^N$. Let
$Z^\star = x h^T$. Using the lifting trick [10], [11], [18], (6) can
be rewritten as a linear measurement of $Z^\star$,

$$y_n = e_n^T Z^\star b_n = \langle Z^\star, e_n b_n^H \rangle, \quad n = -2M, \ldots, 2M,$$

where $\langle Y, X \rangle = \mathrm{Tr}(X^H Y)$. Therefore, $y$ can be regarded as a
set of linear measurements of $Z^\star$, i.e.

$$y = \mathcal{X}(Z^\star), \tag{7}$$

where $\mathcal{X}: \mathbb{C}^{N \times L} \mapsto \mathbb{C}^{N}$ denotes the operator that performs
the linear mapping (6).
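The lifting step can be checked numerically. The sketch below implements the measurement operator $\mathcal{X}$ and its adjoint for a generic $B$, verifies that $\mathcal{X}(x h^T)$ reproduces $y = \mathrm{diag}(Bh)x$, and confirms that the leading left singular vector of the rank-one matrix $x h^T$ is aligned with $x$. The function names `lift_forward` and `lift_adjoint` and all sizes are hypothetical choices for illustration.

```python
import numpy as np

def lift_forward(Z, B):
    """X(Z)_n = e_n^T Z b_n, where b_n^T is the n-th row of B."""
    return np.sum(Z * B, axis=1)

def lift_adjoint(p, B):
    """X^*(p) = sum_n p_n e_n b_n^H, i.e. diag(p) @ conj(B)."""
    return p[:, None] * np.conj(B)

rng = np.random.default_rng(1)
N, L = 17, 3                                   # illustrative sizes (assumed)
B = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

y = (B @ h) * x                # bilinear measurements y = diag(Bh) x
Z_star = np.outer(x, h)        # lifted rank-one matrix x h^T (no conjugation)
y_lifted = lift_forward(Z_star, B)

# Leading left singular vector of Z_star is x up to a scalar
U, s, Vh = np.linalg.svd(Z_star)
alignment = abs(np.vdot(U[:, 0], x)) / np.linalg.norm(x)

# Adjoint check: <X(Z), p> = <Z, X^*(p)> with <Y, X> = Tr(X^H Y)
p = rng.standard_normal(N) + 1j * rng.standard_normal(N)
Z = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
lhs = np.vdot(p, lift_forward(Z, B))
rhs = np.vdot(lift_adjoint(p, B), Z)
```

The bilinear measurements become linear in the lifted variable, at the cost of moving from $N + L$ unknowns to $NL$ unknowns.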

Since from $Z^\star$ we can recover $x$ and $g$, up to a scaling
factor, from the left and right singular vectors of $Z^\star$, respectively
(the right singular vector gives $h$, and hence $g = Bh$),
we now wish to recover the matrix $Z^\star$ from $y$. Though it
appears that the number of unknowns is much larger than the
number of measurements, a key observation is that $Z^\star$ can
be regarded as a signal ensemble where each column signal is
composed of $K$ complex sinusoids with the same frequencies,

$$Z^\star = x h^T = \sum_{k=1}^{K} a_k \, c(\tau_k) \, h^T \in \mathbb{C}^{N \times L},$$

where $a = [a_1, \ldots, a_K]^T = \sqrt{N} [\tilde{a}_1, \ldots, \tilde{a}_K]^T$, and

$$c(\tau) = \frac{1}{\sqrt{N}} \left[ e^{-j 2\pi (-2M) \tau}, \ldots, 1, \ldots, e^{-j 2\pi (2M) \tau} \right]^T$$

represents a complex sinusoid with the frequency $\tau \in [0,1)$.
Therefore, it is possible to promote the joint spectral sparsity
of the columns of $Z^\star$ by minimizing the atomic norm [24]
for joint spectrally-sparse ensembles proposed by the author
in [13]. To proceed, define the set of atoms as

$$\mathcal{A} = \left\{ A(\tau, u) = c(\tau) u^H \in \mathbb{C}^{N \times L} : \tau \in [0,1), \ \|u\|_2 = 1 \right\}.$$


The atomic norm seeks the tightest convex relaxation of
decomposing a matrix $Z \in \mathbb{C}^{N \times L}$ into the smallest number
of atoms in $\mathcal{A}$, and is defined as [13]

$$\|Z\|_{\mathcal{A}} = \inf \{ t > 0 : Z \in t \, \mathrm{conv}(\mathcal{A}) \}
= \inf_{\substack{\tau_k \in [0,1) \\ u_k \in \mathbb{C}^L : \|u_k\|_2 = 1}} \left\{ \sum_k c_k : Z = \sum_k c_k A(\tau_k, u_k), \ c_k \ge 0 \right\}, \tag{8}$$

where $\mathrm{conv}(\mathcal{A})$ is the convex hull of $\mathcal{A}$. The corresponding
decomposition of $Z$ that achieves the atomic norm is called the
atomic decomposition. Moreover, $\|Z\|_{\mathcal{A}}$ admits an equivalent
semidefinite programming (SDP) characterization [13] which
can be computed efficiently using off-the-shelf solvers:

$$\|Z\|_{\mathcal{A}} = \inf_{u \in \mathbb{C}^{N}, \, W \in \mathbb{C}^{L \times L}} \left\{ \frac{1}{2} \mathrm{Tr}(\mathrm{toep}(u)) + \frac{1}{2} \mathrm{Tr}(W) : \begin{bmatrix} \mathrm{toep}(u) & Z \\ Z^H & W \end{bmatrix} \succeq 0 \right\},$$

where $\mathrm{toep}(u)$ is the Toeplitz matrix with $u$ as the first
column. We then propose the following algorithm, denoted
as AtomicLift, to promote the joint spectral sparsity of $Z$ by
seeking the matrix with the smallest atomic norm satisfying
the measurements:

$$\hat{Z} = \arg\min_{Z \in \mathbb{C}^{N \times L}} \|Z\|_{\mathcal{A}} \quad \text{s.t.} \quad y = \mathcal{X}(Z). \tag{9}$$
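To make the SDP characterization concrete without assuming any particular solver, the sketch below checks one direction of it numerically: given an explicit decomposition $Z = \sum_k c_k c(\tau_k) u_k^H$, the choices $\mathrm{toep}(u) = \sum_k c_k c(\tau_k) c(\tau_k)^H$ and $W = \sum_k c_k u_k u_k^H$ are feasible for the SDP and attain the objective value $\sum_k c_k$, which therefore upper-bounds $\|Z\|_{\mathcal{A}}$. The sizes, delays, and coefficients are assumptions made for illustration.

```python
import numpy as np

def sinusoid(tau, M):
    """Atom c(tau) = N^{-1/2} [exp(-j 2 pi n tau)], n = -2M, ..., 2M."""
    n = np.arange(-2 * M, 2 * M + 1)
    return np.exp(-2j * np.pi * n * tau) / np.sqrt(n.size)

rng = np.random.default_rng(2)
M, L = 3, 2                                   # illustrative sizes (assumed)
taus = np.array([0.2, 0.7])
coefs = np.array([1.5, 0.5])                  # nonnegative weights c_k
U = rng.standard_normal((2, L)) + 1j * rng.standard_normal((2, L))
U /= np.linalg.norm(U, axis=1, keepdims=True) # row k holds u_k^T, unit norm

C = np.stack([sinusoid(t, M) for t in taus], axis=1)  # N x K matrix of atoms
Z = (C * coefs) @ U.conj()                    # sum_k c_k c(tau_k) u_k^H
T = (C * coefs) @ C.conj().T                  # sum_k c_k c c^H (Toeplitz)
W = (U.T * coefs) @ U.conj()                  # sum_k c_k u_k u_k^H

block = np.block([[T, Z], [Z.conj().T, W]])   # SDP constraint matrix
min_eig = np.linalg.eigvalsh(block).min()
bound = 0.5 * np.trace(T).real + 0.5 * np.trace(W).real
```

Solving (9) itself requires a semidefinite solver; this check only illustrates why the SDP objective matches the sum of the atomic coefficients for a feasible decomposition.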


C. Performance Guarantee of AtomicLift 

The main result of this paper is that if we assume the rows 
of the subspace B are i.i.d. drawn from some distribution that 
satisfies the isotropy property and the incoherence property, 
together with a mild separation condition for the spike signal, 
the proposed AtomicLift algorithm provably recovers the PSF 
as well as the spike signal, up to a scaling ambiguity, with 
high probability as long as M is large enough. 

Specifically, we assume each row of $B$ is sampled
independently and identically from a population $F$, i.e. $b_n \sim F$,
$n = -2M, \ldots, 2M$. Furthermore, we require that $F$ satisfies the
following properties:

• Isotropy property: $F$ is said to satisfy the isotropy
property if for $b \sim F$,

$$\mathbb{E}\, b b^H = I_L.$$

• Incoherence property: for $b = [b_1, \ldots, b_L]^T \sim F$, define
the coherence parameter $\mu$ of $F$ as the smallest number
such that

$$\max_{1 \le i \le L} |b_i|^2 \le \mu$$

holds.

The above definitions are motivated by [15] in the development
of a RIPless theory of compressed sensing. In particular,
the incoherence parameter $\mu$ is a deterministic bound on the
maximum entry of $b$, which can be extended to a stochastic
setting using the stochastic incoherence discussed in [15], so
that $F$ is allowed to be composed of unbounded sub-Gaussian
or sub-exponential distributions. It is possible to extend our
results to the stochastic setting in [15], but for conciseness in
this paper, we will limit ourselves to the deterministic setting.

We discuss the implications of the above properties for the
PSF $g = Bh$ when $h$ is arbitrary. Following the isotropy
property, we have $\mathbb{E} |g_n|^2 = \|h\|_2^2$ for all $n$, which means
that on average, the PSF is "spectrally flat", having the same
energy across different frequencies. Also, $\mu \ge 1$ following
the isotropy property, where the lower bound can be met, for
example, by selecting $b \sim F$ to be in the form of

$$b = \left[ 1, e^{j 2\pi f}, \ldots, e^{j 2\pi (L-1) f} \right]^T, \tag{10}$$

where $f$ is chosen uniformly at random in $[0,1]$.
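The two properties can be checked empirically. The sketch below assumes rows of the form $b = [1, e^{j2\pi f}, \ldots, e^{j2\pi(L-1)f}]^T$ with $f$ uniform on $[0,1]$ (one construction that meets the lower bound $\mu = 1$; the sign convention in the exponent is an assumption), estimates $\mathbb{E}\, bb^H$ by Monte Carlo, and evaluates the coherence. The helper name `sample_rows` and the sample count are hypothetical.

```python
import numpy as np

def sample_rows(num_rows, L, rng):
    """Draw rows b^T = [1, e^{j 2 pi f}, ..., e^{j 2 pi (L-1) f}], f ~ U[0, 1]."""
    f = rng.uniform(0.0, 1.0, size=num_rows)
    return np.exp(2j * np.pi * np.outer(f, np.arange(L)))

rng = np.random.default_rng(3)
L = 3
B = sample_rows(200_000, L, rng)

# Empirical E[b b^H]: entry (i, k) averages b_i * conj(b_k) over the rows
cov = B.T @ B.conj() / B.shape[0]

# Coherence: largest squared entry magnitude over all sampled rows
mu = np.max(np.abs(B) ** 2)
```

With this construction every entry has unit modulus, so the incoherence bound holds with $\mu = 1$, and the off-diagonal entries of `cov` shrink toward zero as the number of rows grows.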

Furthermore, define the minimum separation of the spike
signal as

$$\Delta = \min_{k \ne l} |\tau_k - \tau_l|,$$

which is evaluated as the wrap-around distance on the unit
circle. The performance guarantee of AtomicLift is presented
in Theorem 1, which is proven in Section IV.
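A minimal sketch of the wrap-around minimum separation follows. It sorts the delays on the unit circle and takes the smallest gap between circular neighbors, which equals the smallest pairwise wrap-around distance. The function name `min_separation` is a hypothetical helper, not part of the paper's code.

```python
import numpy as np

def min_separation(taus):
    """Smallest wrap-around distance between any two delays in [0, 1)."""
    t = np.sort(np.mod(np.asarray(taus, dtype=float), 1.0))
    if t.size < 2:
        return np.inf
    # Append the first point shifted by one period to close the circle
    gaps = np.diff(np.concatenate([t, [t[0] + 1.0]]))
    return gaps.min()

# Delays 0.05 and 0.95 are 0.1 apart on the circle, not 0.9
delta = min_separation([0.05, 0.5, 0.95])
```

Such a helper can be used to reject randomly drawn spike locations that violate a required separation such as $\Delta \ge 1/M$.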

Theorem 1. Let $M \ge 64$. Assume $g$ lies in a random
subspace $B$ whose rows are sampled i.i.d. from a population $F$
satisfying the isotropy property and the incoherence property,
with the coherence parameter $\mu$. If $\Delta \ge 1/M$, then there exists
a numerical constant $C$ such that

$$M \ge C \mu K^2 L^2 \log^{\cdots}(\cdots)$$

is sufficient to guarantee that we can recover $Z^\star = x h^T$ via
the AtomicLift algorithm with probability at least $1 - \delta$.

Theorem 1 allows $g$ to have arbitrary orientation in the
subspace $B$, and applies to any deterministic spike signal
as long as it satisfies the separation condition $\Delta \ge 1/M$,
regardless of the amplitudes. The separation is the same as that
required by Candes and Fernandez-Granda [3] for spikes
deconvolution using total variation minimization even when
the PSF is known perfectly. Therein they established that
$N = O(K)$ measurements are sufficient to exactly recover
the spike signal. In comparison, our performance guarantee is
probabilistic and holds with high probability.

Theorem 1 suggests that as long as $M$ is on the order
of $O(K^2 L^2)$ up to logarithmic factors, AtomicLift provably
recovers the spike signal with high probability, provided that
the conditions in Theorem 1 are satisfied. If $L$ is a small
constant independent of $K$ and $M$, our bound simplifies to
$M \ge O(K^2)$ up to logarithmic factors, which suggests blind spike
deconvolution is possible at a cost of more measurements.

D. AtomicLift for noisy data 

We consider a noisy version of (4), where the
frequency-domain data samples are contaminated by additive noise:

$$y_n = x_n \cdot g_n + w_n, \tag{11}$$

where $w = [w_{-2M}, \ldots, w_{2M}]^T$ is bounded by $\|w\|_2 \le \epsilon$.
Accordingly, the lifted measurement model becomes

$$y = \mathcal{X}(Z^\star) + w. \tag{12}$$

The AtomicLift algorithm can be modified with a measurement
constraint that respects the noise level, given as

$$\hat{Z}_{\mathrm{noisy}} = \arg\min_{Z \in \mathbb{C}^{N \times L}} \|Z\|_{\mathcal{A}} \quad \text{s.t.} \quad \|y - \mathcal{X}(Z)\|_2 \le \epsilon. \tag{13}$$
Fig. 1: Blind spikes deconvolution using AtomicLift: (a) PSF; (b) convolution between the PSF in (a) and a sparse spike signal;
(c) deconvolution with the PSF using (b); (d) exact localization of the spikes via the dual polynomial.


E. Spike Localization 

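Define $\langle Y, X \rangle_{\mathbb{R}} = \mathrm{Re}(\langle Y, X \rangle)$. The dual norm of $\| \cdot \|_{\mathcal{A}}$
can be defined as [13]

$$\|Y\|_{\mathcal{A}}^{*} = \sup_{\tau \in [0,1)} \|Y^H c(\tau)\|_2.$$

The dual problem of (9) can thus be written as

$$\hat{p} = \arg\max_{p \in \mathbb{C}^{N}} \langle p, y \rangle_{\mathbb{R}} \quad \text{s.t.} \quad \|\mathcal{X}^{*}(p)\|_{\mathcal{A}}^{*} \le 1, \tag{14}$$

and the dual problem of (13) can be written as

$$\hat{p} = \arg\max_{p \in \mathbb{C}^{N}} \langle p, y \rangle_{\mathbb{R}} - \epsilon \|p\|_2 \quad \text{s.t.} \quad \|\mathcal{X}^{*}(p)\|_{\mathcal{A}}^{*} \le 1, \tag{15}$$

where $\mathcal{X}^{*}(p) = \sum_{n=-2M}^{2M} p_n e_n b_n^H$ is the adjoint of $\mathcal{X}$. Write the
vector-valued dual polynomial $Q(\tau) \in \mathbb{C}^{L}$ as

$$Q(\tau) = (\mathcal{X}^{*}(\hat{p}))^H c(\tau),$$

where $\hat{p}$ is the solution of the dual problem (14) or (15);
then the spikes can be localized by the peaks of $\|Q(\tau)\|_2$:

$$\hat{\mathcal{T}} = \{ \tau \in [0,1) : \|Q(\tau)\|_2 = 1 \}. \tag{16}$$

We refer the readers to standard arguments in [14], [16], [25]
for more details.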
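The localization rule (16) can be sketched as a grid search. The code below evaluates $\|Q(\tau)\|_2$ with $Q(\tau) = (\mathcal{X}^*(p))^H c(\tau)$ on a uniform grid and keeps local maxima whose value is within a tolerance of one. The function names, the grid size, and the tolerance `tol` are assumptions for illustration; the dual vector `p` would normally come from a solver, whereas here a synthetic `p` is used so that the peak location is known in advance.

```python
import numpy as np

def dual_poly_norm(p, B, grid):
    """Return ||Q(tau)||_2 on the grid, with Q(tau) = (X^*(p))^H c(tau)."""
    N = B.shape[0]
    half = (N - 1) // 2                        # equals 2M when N = 4M + 1
    n = np.arange(-half, half + 1)
    C = np.exp(-2j * np.pi * np.outer(n, grid)) / np.sqrt(N)  # columns c(tau)
    X_adj = p[:, None] * np.conj(B)            # X^*(p) = sum_n p_n e_n b_n^H
    Q = X_adj.conj().T @ C                     # L x len(grid)
    return np.linalg.norm(Q, axis=0)

def localize(p, B, grid, tol=1e-3):
    """Grid points that are local maxima of ||Q|| with value near 1."""
    q = dual_poly_norm(p, B, grid)
    is_peak = (q >= np.roll(q, 1)) & (q >= np.roll(q, -1)) & (q >= 1.0 - tol)
    return grid[is_peak]

# Synthetic check with L = 1 and B = 1: choosing p = c(0.3) makes
# |Q(tau)| a normalized Dirichlet kernel that peaks at tau = 0.3.
N = 17
n = np.arange(-8, 9)
B = np.ones((N, 1), dtype=complex)
p = np.exp(-2j * np.pi * n * 0.3) / np.sqrt(N)
grid = np.linspace(0.0, 1.0, 1000, endpoint=False)
est = localize(p, B, grid)
```

In practice the grid resolution limits the localization accuracy, so a finer grid or a local refinement around each detected peak may be needed.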

III. Numerical Experiments 

We perform a series of numerical experiments to validate
the performance of AtomicLift implemented using MOSEK
[26]*. Without loss of generality, in all numerical
experiments, we set the index $n \in \{0, \ldots, N - 1\}$, rather than
$n \in \{-2M, \ldots, 2M\}$ as in the previous sections.

Let $N = 64$. We first generate the spike locations uniformly
at random, respecting the minimum separation $\Delta \ge 1/N$
(which is smaller than what the theory requires), with
coefficients generated with a dynamic range of 10 dB and
uniform phase. We generate the subspace $B$ by selecting each
row i.i.d. following (10) with $L = 3$, and choose the coefficient
vector $h$ as an all-one vector. Fig. 1 (a) shows the PSF in the
time domain, and the convolution with a spike signal with
$K = 6$ spikes is shown in (b). If the PSF is known, one
can deconvolve (b) with the PSF and obtain the calibrated
time-domain signal in (c). Clearly Fig. 1 (c) is very different
from Fig. 1 (b); therefore calibration must be performed if the
PSF is unknown to avoid severe performance degradation.
Fig. 1 (d) demonstrates the exact localization of the spikes via
computing the dual polynomial of AtomicLift.

*The code can be downloaded from http://www2.ece.ohio-state.edu/~chi/papers/atomiclift.m.

We next examine the phase transition of the AtomicLift
algorithm. We randomly generate the low-dimensional subspace $B$
with i.i.d. standard Gaussian entries, and the coefficient vector
$h$ with i.i.d. standard Gaussian entries. The spike signal is
generated in the same fashion as above. For each simulation,
we compute the normalized error $\|\hat{Z} - Z^\star\|_F / \|Z^\star\|_F$, where
$\hat{Z}$ is the estimate of the lifted matrix $Z^\star = x h^T$. First
fix $N = 64$. For each pair of $(K, L)$, we run 20 Monte
Carlo simulations of the AtomicLift algorithm, and claim the
reconstruction of a simulation is successful if the normalized
error is below 10“^. Fig. 2 (a) shows the average success 
rate with respect to the number of spikes $K$ and the subspace
dimension $L$. For comparison, we also plot the hyperbola
$KL = 20$, which roughly matches the phase transition
boundary. Similarly, Fig. 2 (b) shows the average success rate
with respect to $L$ and $N$ for a fixed $K = 4$, and Fig. 2
(c) shows the average success rate with respect to $K$ and $N$
for a fixed $L = 3$. The phase transition plots suggest that
AtomicLift succeeds when $N \ge O(KL)$, which is better than
the prediction of our theory. Fig. 2 (d) shows the average
success rate with respect to $K$ and $N$ in the same setting as
Fig. 2 (b), except that the locations of the spikes are randomly
generated without obeying the separation condition. It can be
seen that the phase transition is not as sharp, which is in line
with existing results in applying atomic norm minimization to
spectrum estimation [14], [16], [25].

Finally, we examine the performance of AtomicLift in the
noisy setting. Using a setup similar to that of Fig. 2 with $N = 64$
and $L = 3$, we introduce additive white Gaussian
noise as in (11), where each $w_n$ is generated i.i.d. from
$\mathcal{CN}(0, \sigma^2)$. The signal-to-noise ratio (SNR) is defined as
$10 \log_{10} \left( \|\mathcal{X}(Z^\star)\|_2^2 / (N \sigma^2) \right)$ dB. Using a standard tail bound
P (||t (;||2 < ay'N + y/m\og2N'^ > 1 - (2A)-' [27], we 

set e := a in (13). We use (16) to identify 
the spike locations. Fig. 3 shows the recovered source locations
and their magnitudes. It is worth noting that the dual
polynomial will overestimate the number of sources, and further
model order estimation is still necessary for noisy data.
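The noise model of this experiment can be sketched as follows, assuming the SNR definition $10\log_{10}(\|\mathcal{X}(Z^\star)\|_2^2/(N\sigma^2))$ dB and circularly symmetric complex Gaussian noise $\mathcal{CN}(0, \sigma^2)$, whose real and imaginary parts each have variance $\sigma^2/2$. The helper name `add_noise` and the all-ones clean signal are hypothetical and used only for illustration.

```python
import numpy as np

def add_noise(y_clean, snr_db, rng):
    """Add CN(0, sigma^2) noise so that ||y_clean||^2 / (N sigma^2) hits the SNR."""
    N = y_clean.size
    sigma2 = np.linalg.norm(y_clean) ** 2 / (N * 10.0 ** (snr_db / 10.0))
    # Real and imaginary parts each carry half of the variance
    w = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    return y_clean + w, np.sqrt(sigma2)

rng = np.random.default_rng(4)
y_clean = np.ones(64, dtype=complex)      # placeholder for X(Z*), assumed
y_noisy, sigma = add_noise(y_clean, snr_db=15.0, rng=rng)
```

The returned `sigma` is the quantity needed to set the noise level $\epsilon$ in (13) from a tail bound on $\|w\|_2$.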
Fig. 2: The average success rate of AtomicLift (a) with respect to the number of spikes $K$ and the subspace dimension $L$ when
the number of measurements $N = 64$; (b) with respect to $K$ and $N$ when $L = 3$; (c) with respect to $L$ and $N$ when $K = 4$
for spikes generated satisfying a separation condition $\Delta \ge 1/N$; (d) with respect to $L$ and $N$ when $K = 4$ for randomly
generated spike locations.
Fig. 3: Source localization of AtomicLift in the noisy setting,
(a) and (b): source locations identified using the dual
polynomial and the recovered magnitudes, when $K = 6$, $N = 64$,
$L = 3$ and SNR = 15 dB.

IV. Proof of Main Theorem 

In this section we prove Theorem 1. First, we describe the
desired form of a valid vector-valued dual polynomial and
the properties that guarantee the optimality of AtomicLift
in Proposition 1. We next design the dual polynomial with
the help of the squared Fejér's kernel [3] in Section IV-A.
The rest of the proof then carefully validates that it satisfies
all the required properties. Proposition 1 presents a sufficient
condition for the optimality of AtomicLift in (9), whose proof
is provided in Appendix A.

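Proposition 1. The solution to (9) is unique if there exists a
vector $q \in \mathbb{C}^{N}$ such that the vector-valued dual polynomial

$$Q(\tau) = (\mathcal{X}^{*}(q))^H c(\tau) = B^T \mathrm{diag}(q^*) c(\tau) = \frac{1}{\sqrt{N}} \sum_{n=-2M}^{2M} q_n^* b_n e^{-j 2\pi n \tau} \tag{17}$$

satisfies

$$Q(\tau_k) = \frac{1}{\|h\|_2} \mathrm{sign}(a_k^*) h^*, \quad \forall \tau_k \in \mathcal{T}, \tag{18a}$$

$$\|Q(\tau)\|_2 < 1, \quad \forall \tau \in [0,1] \setminus \mathcal{T}, \tag{18b}$$

where $\mathrm{sign}(\cdot)$ is the complex sign function.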


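Proposition 1 suggests that the solution to AtomicLift in (9)
is exact and equals $Z^\star$ if we can find a dual polynomial
$Q(\tau)$ with the valid form that satisfies (18a) and (18b). Our
objective is then to construct such a valid dual polynomial.

A. Construction of the dual polynomial

Without loss of generality, we assume $\|h\|_2 = 1$ from now
on. Consider the squared Fejér's kernel and its derivative as

$$K(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n e^{-j 2\pi \tau n}, \qquad
K'(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n (-j 2\pi n) e^{-j 2\pi \tau n},$$

where $s_n = \frac{1}{M} \sum_{i=\max(n-M,-M)}^{\min(n+M,M)} \left( 1 - \left| \frac{i}{M} \right| \right) \left( 1 - \left| \frac{n}{M} - \frac{i}{M} \right| \right)$
following the definition in [3]. To proceed, we define randomized
matrix-valued versions of $K(\tau)$ and $K'(\tau)$ as

$$\boldsymbol{K}(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n b_n b_n^H e^{-j 2\pi \tau n}, \qquad
\boldsymbol{K}'(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n (-j 2\pi n) b_n b_n^H e^{-j 2\pi \tau n},$$

where the derivatives are entry-wise. Clearly,

$$\mathbb{E} \boldsymbol{K}(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n \mathbb{E}(b_n b_n^H) e^{-j 2\pi \tau n} = K(\tau) I_L,$$

and $\mathbb{E} \boldsymbol{K}'(\tau) = K'(\tau) I_L$, where $I_L$ is the $L \times L$ identity
matrix, following the isotropy property. We then construct the
vector-valued dual polynomial $Q(\tau) \in \mathbb{C}^{L}$ as

$$Q(\tau) = \sum_{k=1}^{K} \boldsymbol{K}(\tau - \tau_k) \alpha_k + \sum_{k=1}^{K} \boldsymbol{K}'(\tau - \tau_k) \beta_k, \tag{19}$$

where $\alpha_k = [\alpha_{k,1}, \ldots, \alpha_{k,L}]^T \in \mathbb{C}^{L}$ and
$\beta_k = [\beta_{k,1}, \ldots, \beta_{k,L}]^T \in \mathbb{C}^{L}$, $k = 1, \ldots, K$. It is straightforward
to see that $Q(\tau)$ has the valid form defined in (17) of
Proposition 1. We select the coefficient vectors as the solution
to the linear equations given below

$$\begin{cases} Q(\tau_k) = \mathrm{sign}(a_k^*) h^*, & \tau_k \in \mathcal{T}, \\ Q'(\tau_k) = 0, & \tau_k \in \mathcal{T}, \end{cases} \tag{20}$$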
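which can be rewritten as

$$\begin{bmatrix}
\boldsymbol{K}(0) & \cdots & \boldsymbol{K}(\tau_1 - \tau_K) & \kappa \boldsymbol{K}'(0) & \cdots & \kappa \boldsymbol{K}'(\tau_1 - \tau_K) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\boldsymbol{K}(\tau_K - \tau_1) & \cdots & \boldsymbol{K}(0) & \kappa \boldsymbol{K}'(\tau_K - \tau_1) & \cdots & \kappa \boldsymbol{K}'(0) \\
-\kappa \boldsymbol{K}'(0) & \cdots & -\kappa \boldsymbol{K}'(\tau_1 - \tau_K) & -\kappa^2 \boldsymbol{K}''(0) & \cdots & -\kappa^2 \boldsymbol{K}''(\tau_1 - \tau_K) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
-\kappa \boldsymbol{K}'(\tau_K - \tau_1) & \cdots & -\kappa \boldsymbol{K}'(0) & -\kappa^2 \boldsymbol{K}''(\tau_K - \tau_1) & \cdots & -\kappa^2 \boldsymbol{K}''(0)
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_K \\ \kappa^{-1} \beta_1 \\ \vdots \\ \kappa^{-1} \beta_K \end{bmatrix}
=
\begin{bmatrix} \mathrm{sign}(a_1^*) h^* \\ \vdots \\ \mathrm{sign}(a_K^*) h^* \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{21}$$

after denoting $\kappa = 1 / \sqrt{|K''(0)|}$. For simplicity, we denote the LHS matrix of
(21) as $\Gamma \in \mathbb{C}^{2LK \times 2LK}$. Before inverting (21), we need to
establish that $\Gamma$ is invertible with high probability.

The deterministic counterpart of $\Gamma$ built from the scalar kernel $K(\tau)$ is

$$\Phi = \begin{bmatrix}
K(0) & \cdots & K(\tau_1 - \tau_K) & \kappa K'(0) & \cdots & \kappa K'(\tau_1 - \tau_K) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
K(\tau_K - \tau_1) & \cdots & K(0) & \kappa K'(\tau_K - \tau_1) & \cdots & \kappa K'(0) \\
-\kappa K'(0) & \cdots & -\kappa K'(\tau_1 - \tau_K) & -\kappa^2 K''(0) & \cdots & -\kappa^2 K''(\tau_1 - \tau_K) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
-\kappa K'(\tau_K - \tau_1) & \cdots & -\kappa K'(0) & -\kappa^2 K''(\tau_K - \tau_1) & \cdots & -\kappa^2 K''(0)
\end{bmatrix}
= \frac{1}{M} \sum_{n=-2M}^{2M} s_n \nu_n \nu_n^H. \tag{22}$$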
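The squared Fejér kernel used in the construction can be checked numerically. The sketch below computes the coefficients $s_n$ from the double-sum definition, evaluates $K(\tau) = \frac{1}{M}\sum_n s_n e^{-j2\pi\tau n}$, and compares it with the closed form $\left[\sin(\pi M \tau) / (M \sin(\pi \tau))\right]^4$ of the squared Fejér kernel. The function names and the value of `M` are assumptions for illustration.

```python
import numpy as np

def fejer_sq_coeffs(M):
    """Coefficients s_n, n = -2M, ..., 2M, of the squared Fejer kernel."""
    n = np.arange(-2 * M, 2 * M + 1)
    s = np.zeros(n.size)
    for idx, nn in enumerate(n):
        i = np.arange(max(nn - M, -M), min(nn + M, M) + 1)
        s[idx] = np.sum((1 - np.abs(i) / M) * (1 - np.abs(nn - i) / M)) / M
    return n, s

def kernel(tau, M):
    """K(tau) = (1/M) sum_n s_n exp(-j 2 pi tau n); real since s_n = s_{-n}."""
    n, s = fejer_sq_coeffs(M)
    return np.real(np.sum(s * np.exp(-2j * np.pi * n * tau))) / M

def kernel_closed_form(tau, M):
    """Closed form [sin(pi M tau) / (M sin(pi tau))]^4, valid for tau != 0."""
    return (np.sin(np.pi * M * tau) / (M * np.sin(np.pi * tau))) ** 4

M = 8
k0 = kernel(0.0, M)
k_series = kernel(0.03, M)
k_closed = kernel_closed_form(0.03, M)
```

The kernel equals one at the origin and decays quickly away from it, which is what allows the interpolation conditions (20) to be met while keeping $\|Q(\tau)\|_2$ below one elsewhere.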


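B. Invertibility of $\Gamma$

The expectation of $\Gamma$ can be given as $\mathbb{E} \Gamma = \bar{\Gamma} = \Phi \otimes I_L$,
where $\otimes$ is the Kronecker product, and $\Phi \in \mathbb{C}^{2K \times 2K}$ is given in
(22), where

$$\nu_n = \left[ e^{-j 2\pi \tau_1 n}, \ldots, e^{-j 2\pi \tau_K n}, \ (j 2\pi n \kappa) e^{-j 2\pi \tau_1 n}, \ldots, (j 2\pi n \kappa) e^{-j 2\pi \tau_K n} \right]^T.$$

The following lemma from [16] regarding $\Phi$ is useful.

Lemma 1. [16, Proposition IV.1] Let $\Delta \ge 1/M$. Then $\Phi$ is
invertible and

$$\|I - \Phi\| \le 0.3623, \quad \|\Phi\| \le 1.3623, \quad \|\Phi^{-1}\| \le 1.568.$$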


Lemma 2. Let $0 < \delta < 1$ and $\Delta \ge 1/M$. For any $\chi \in
(0, 0.6376)$, as long as


M > 


SOpiTL , 

-^log 



(24) 


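we have $\|\Gamma - \bar{\Gamma}\| \le \chi$ holds with probability at least $1 - \delta$.

Denote the event $\mathcal{E}_{1,\chi} = \{ \|\Gamma - \bar{\Gamma}\| \le \chi \}$, which holds
with probability at least $1 - \delta$ as long as (24) holds. This
implies that the matrix $\Gamma$ is invertible when $\mathcal{E}_{1,\chi}$ holds for
some $0 < \chi < 0.6376$, since

$$\|I - \Gamma\| \le \|I - \bar{\Gamma}\| + \|\bar{\Gamma} - \Gamma\| \le 0.3623 + \chi < 1,$$

where $\|I - \bar{\Gamma}\| = \|I - \Phi\| \le 0.3623$ from Lemma 1. Under
$\mathcal{E}_{1,\chi}$, let $\Gamma^{-1} = [\mathcal{L} \ \ \mathcal{R}]$, where $\mathcal{L} \in \mathbb{C}^{2LK \times LK}$ and
$\mathcal{R} \in \mathbb{C}^{2LK \times LK}$; then we have

$$\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_K \\ \kappa^{-1} \beta_1 \\ \vdots \\ \kappa^{-1} \beta_K \end{bmatrix}
= \Gamma^{-1} \begin{bmatrix} \mathrm{sign}(a_1^*) h^* \\ \vdots \\ \mathrm{sign}(a_K^*) h^* \\ 0 \\ \vdots \\ 0 \end{bmatrix}
= \mathcal{L} \left[ \mathrm{sign}(a^*) \otimes h^* \right]. \tag{25}$$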

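Note that we can write $\Gamma$ as a sum of independent random
matrices as

$$\Gamma = \frac{1}{M} \sum_{n=-2M}^{2M} s_n (\nu_n \otimes b_n)(\nu_n \otimes b_n)^H
= \frac{1}{M} \sum_{n=-2M}^{2M} s_n (\nu_n \nu_n^H) \otimes (b_n b_n^H),$$

then $\Gamma - \bar{\Gamma}$ can be written as

$$\Gamma - \bar{\Gamma} = \sum_{n=-2M}^{2M} S_n, \tag{23}$$

where $S_n = \frac{1}{M} s_n (\nu_n \nu_n^H) \otimes (b_n b_n^H - I_L) \in \mathbb{C}^{2KL \times 2KL}$.

The following lemma, whose proof is given in Appendix B,
establishes the concentration of $\Gamma$ around $\bar{\Gamma}$.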

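With the above parameterization, $Q(\tau)$ satisfies the first
condition in (18a). Let $\Phi^{-1} = [\boldsymbol{L} \ \ \boldsymbol{R}]$, where $\boldsymbol{L} \in \mathbb{C}^{2K \times K}$ and
$\boldsymbol{R} \in \mathbb{C}^{2K \times K}$. Then we have

$$\bar{\Gamma}^{-1} = \Phi^{-1} \otimes I_L = [\boldsymbol{L} \ \ \boldsymbol{R}] \otimes I_L = [\boldsymbol{L} \otimes I_L \ \ \boldsymbol{R} \otimes I_L],$$

and $\|\bar{\Gamma}^{-1}\| = \|\Phi^{-1}\| \le 1.568$. Then using elementary linear
algebra that is essentially the same as [16, Corollary IV.5], we
have the following lemma.

Lemma 3. On the event $\mathcal{E}_{1,\chi}$ with $\chi \in (0, 1/4]$, we have

$$\|\Gamma^{-1} - \bar{\Gamma}^{-1}\| \le 2 \|\bar{\Gamma}^{-1}\|^2 \chi, \qquad \|\Gamma^{-1}\| \le 2 \|\bar{\Gamma}^{-1}\|.$$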

C. Certifying (18b) 

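The rest of the proof is to guarantee that $Q(\tau)$ satisfies (18b)
with high probability. Let $\boldsymbol{K}^{(m)}(\tau)$ be the $m$th order
entry-wise derivative of $\boldsymbol{K}(\tau)$, $m = 0, 1, 2, 3$. The $(\ell, s)$th entry of
$\boldsymbol{K}^{(m)}(\tau)$ can be written as

$$\boldsymbol{K}^{(m)}_{\ell,s}(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n (-j 2\pi n)^m b_{n,\ell} b_{n,s}^* e^{-j 2\pi \tau n},$$

where $b_{n,\ell}$ is the $\ell$th entry of $b_n$. Further define $\Xi^{(m)}(\tau) \in
\mathbb{C}^{2LK \times L}$ as

$$\Xi^{(m)}(\tau) = \kappa^m \begin{bmatrix}
\boldsymbol{K}^{(m)}(\tau - \tau_1)^H \\ \vdots \\ \boldsymbol{K}^{(m)}(\tau - \tau_K)^H \\
\kappa \boldsymbol{K}^{(m+1)}(\tau - \tau_1)^H \\ \vdots \\ \kappa \boldsymbol{K}^{(m+1)}(\tau - \tau_K)^H
\end{bmatrix}
= \frac{1}{M} \sum_{n=-2M}^{2M} s_n (j 2\pi \kappa n)^m e^{j 2\pi \tau n} (\nu_n \otimes b_n) b_n^H,$$

whose expectation can be written as

$$\mathbb{E}\, \Xi^{(m)}(\tau) = \frac{1}{M} \sum_{n=-2M}^{2M} s_n (j 2\pi \kappa n)^m e^{j 2\pi \tau n} \, \nu_n \otimes I_L = \xi^{(m)}(\tau) \otimes I_L,$$

where

$$\xi^{(m)}(\tau) = \kappa^m \begin{bmatrix}
K^{(m)}(\tau - \tau_1)^* \\ \vdots \\ K^{(m)}(\tau - \tau_K)^* \\
\kappa K^{(m+1)}(\tau - \tau_1)^* \\ \vdots \\ \kappa K^{(m+1)}(\tau - \tau_K)^*
\end{bmatrix}
= \frac{1}{M} \sum_{n=-2M}^{2M} s_n (j 2\pi \kappa n)^m e^{j 2\pi \tau n} \nu_n. \tag{26}$$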


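Let $Q^{(m)}(\tau)$ be the $m$th order entry-wise derivative of $Q(\tau)$, where the $m$th order derivative of the $\ell$th entry of $Q(\tau)$ can be written as

$$Q_\ell^{(m)}(\tau) = \sum_{k=1}^{K} \sum_{s=1}^{L} \left[ K_{\ell,s}^{(m)}(\tau - \tau_k)\, \alpha_{k,s} + K_{\ell,s}^{(m+1)}(\tau - \tau_k)\, \beta_{k,s} \right].$$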

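Then, $\kappa^m Q^{(m)}(\tau)$ can be written as

$$\begin{aligned}
\kappa^m Q^{(m)}(\tau) &= [\Xi^{(m)}(\tau)]^H L\, [\mathrm{sign}(a^*) \otimes h^*] \\
&= [\Xi^{(m)}(\tau) - \bar{\Xi}^{(m)}(\tau) + \bar{\Xi}^{(m)}(\tau)]^H (L - \bar{L} \otimes I_L + \bar{L} \otimes I_L)\, [\mathrm{sign}(a^*) \otimes h^*] \\
&= [\bar{\Xi}^{(m)}(\tau)]^H (\bar{L} \otimes I_L)\, [\mathrm{sign}(a^*) \otimes h^*] + I_1^{(m)}(\tau) + I_2^{(m)}(\tau),
\end{aligned} \tag{27}$$

where

$$I_1^{(m)}(\tau) = [\Xi^{(m)}(\tau) - \bar{\Xi}^{(m)}(\tau)]^H L\, [\mathrm{sign}(a^*) \otimes h^*],$$

$$I_2^{(m)}(\tau) = [\bar{\Xi}^{(m)}(\tau)]^H (L - \bar{L} \otimes I_L)\, [\mathrm{sign}(a^*) \otimes h^*].$$

Furthermore, the first term in (27) can be written as

$$\begin{aligned}
[\bar{\Xi}^{(m)}(\tau)]^H (\bar{L} \otimes I_L)\, [\mathrm{sign}(a^*) \otimes h^*] &= ([\zeta^{(m)}(\tau)]^H \otimes I_L)(\bar{L} \otimes I_L)\, [\mathrm{sign}(a^*) \otimes h^*] \\
&= \left( [\zeta^{(m)}(\tau)]^H \bar{L}\, \mathrm{sign}(a^*) \right) h^* := \kappa^m \bar{Q}^{(m)}(\tau)\, h^*,
\end{aligned}$$

where $\kappa^m \bar{Q}^{(m)}(\tau) = [\zeta^{(m)}(\tau)]^H \bar{L}\, \mathrm{sign}(a^*)$, and $\bar{Q}(\tau)$ becomes the scalar-valued dual polynomial constructed in [3]. Now, $\kappa^m Q^{(m)}(\tau)$ in (27) can be rewritten as

$$\kappa^m Q^{(m)}(\tau) = \kappa^m \bar{Q}^{(m)}(\tau)\, h^* + I_1^{(m)}(\tau) + I_2^{(m)}(\tau). \tag{28}$$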

The rest of the proof proceeds in the following steps. We first establish that $\|\kappa^m Q^{(m)}(\tau) - \kappa^m \bar{Q}^{(m)}(\tau)\, h^*\|_2$ is uniformly bounded for points on a grid $T_{\mathrm{grid}}$; next, we extend the result to show that $\|\kappa^m Q^{(m)}(\tau) - \kappa^m \bar{Q}^{(m)}(\tau)\, h^*\|_2$ is uniformly bounded for all $\tau \in [0, 1)$; finally, we show that $\|Q(\tau)\|_2 < 1$, $\forall \tau \in [0, 1] \setminus T$.

To bound $I_1^{(m)}(\tau)$, the central argument is to bound $\|\Xi^{(m)}(\tau) - \bar{\Xi}^{(m)}(\tau)\|$ for a fixed $\tau \in [0, 1)$, which is based on a significantly modified argument of [16, Lemma IV.6] whose proof is provided in Appendix C.

Lemma 4. Let $\Delta \ge 1/M$ and fix $\tau \in [0, 1)$. Let


= 2 -T7 max 1, 2KLJ — 


and fix a positive number 
( / \i/4 


\Sfj.L J 


if2KL^^^>l, 

otherwise. 


then we have 


<4™L 



holds for $m = 0, 1, 2, 3$ with probability at least $1 - 64e^{-ca^2}$ for some $c > 0$.

We then establish that $I_1^{(m)}(\tau)$ is bounded on the grid $T_{\mathrm{grid}} \subset [0, 1)$ in the following lemma, proved in Appendix D.

Lemma 5. Let $0 < \delta < 1$ and $\Delta \ge 1/M$. Let $T_{\mathrm{grid}}$ be a finite set defined on $[0, 1]$. As long as


M > Cp max ■ 


for some constant C, we have 


64|Tgrid| 

(5 


,2 /64|Tgrid| 


,KL\og 


}, (29) 


sup I{ {Td) < e, m = 0,1, 2, 3 > > 1 — 165. 

TdeTgHd 2 


We next bound $I_2^{(m)}(\tau)$, which is supplied in the following lemma proved in Appendix E.


Lemma 6. Let $0 < \delta < 1$ and $\Delta \ge 1/M$. As long as

-2j 


M > C/xXMlog 


for some constant $C$, we have


f{m) 


{t) 


< e, m = 0 


,1,2,3} 


(30) 


>1-8(5. 


The next task is to extend the above inequalities to the unit 
interval [0,1] by choosing the grid size properly. Define the 
event 


= 


{| (r) 


< -,m = 0,1 

2-3 


, 2 , 3 } 


Combining Lemma 5 and Lemma 6, and redefining the constants, we have the following lemma, which is proved in Appendix F.


Several interesting questions are left for future investigation. First, the phase transitions of AtomicLift suggest that it succeeds when the number of measurements is $O(KL)$, which is better than what the theory guarantees, namely $O(K^2L^2)$ up to logarithmic factors. We believe it is possible to improve the performance guarantee of AtomicLift, for example by introducing further randomness assumptions on the spike signal [16]. Second, the number of measurements required by AtomicLift is much larger than the degrees of freedom, which is on the order of $O(K + L)$. This may be due to the fact that convex relaxations are not effective at exploiting joint structures in $Z^*$ [28]. Indeed, the rank-one property of $Z^*$ is not exploited in AtomicLift. A promising direction is to develop non-convex algorithms for blind spikes deconvolution, where recent work by Lee et al. [12] could shed some light.

Acknowledgment 


Lemma 7. The event E-i holds with probability at least 1 — 5, 
as long as 



for some constant C. 

Finally, we are ready to certify (18b). Define a small neighborhood of each spike location as $T_{\mathrm{near},i} = \{\tau : |\tau - \tau_i| \le p_c/M\}$, where $p_c = 0.08245$. Let $T_{\mathrm{near}} = \bigcup_{i=1}^{K} T_{\mathrm{near},i}$ and $T_{\mathrm{far}} = [0, 1] \setminus T_{\mathrm{near}}$. We will establish the boundedness by splitting the analysis for $T_{\mathrm{near}}$ and $T_{\mathrm{far}}$. In fact, we have a stronger result in the following lemma, proved in Appendix G.

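Lemma 8. Let $\Delta \ge 1/M$. We have

$$\|Q(\tau)\|_2 \le 1 - C_a, \quad \forall \tau \in T_{\mathrm{far}}, \tag{32a}$$

$$\|Q(\tau)\|_2 \le 1 - C_b M^2 (\tau - \tau_i)^2, \quad \forall \tau \in T_{\mathrm{near},i}, \tag{32b}$$

for some constants $C_a$ and $C_b$ satisfying $p_c^2 C_b \le C_a$, with probability at least $1 - \delta$, as long as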

M > CptK^L^log^ 

for some large enough constant C. 

Putting all these together, we have now proved Theorem 1 since $Q(\tau)$ is verified to be a valid dual certificate.

V. Conclusions 

This paper proposes a convex optimization framework for blind spikes deconvolution based on minimizing the atomic norm of a jointly spectrally-sparse ensemble after lifting the bilinear inverse problem into an under-determined linear inverse problem, by constraining the PSF to be in a known low-dimensional subspace. Under mild conditions, the proposed AtomicLift algorithm provably recovers the spike signal as well as the PSF up to a scaling ambiguity as long as the number of measurements is large enough.


The author thanks Louis L. Scharf for motivation to work on the problem of blind spikes deconvolution, and Zhi Tian, Yuxin Chen and Gongguo Tang for useful discussions. The author also gratefully acknowledges the anonymous reviewers for their constructive feedback, which greatly improved the quality of this paper.

Appendix A 

Proof of Proposition 1 

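Proof: First, any $q$ satisfying (18a) and (18b) is dual feasible. We have

$$\begin{aligned}
\|Z^*\|_{\mathcal{A}} &\ge \|Z^*\|_{\mathcal{A}}\, \|\mathcal{X}^*(q)\|_{\mathcal{A}}^{*} \ge \langle \mathcal{X}^*(q), Z^* \rangle_{\mathbb{R}} = \Big\langle \mathcal{X}^*(q), \sum_{k=1}^{K} a_k c(\tau_k) h^H \Big\rangle_{\mathbb{R}} \\
&= \sum_{k=1}^{K} \mathrm{Re}\left( a_k^* \langle \mathcal{X}^*(q), c(\tau_k) h^H \rangle \right) = \sum_{k=1}^{K} \mathrm{Re}\left( a_k^* \langle Q(\tau_k), h \rangle \right) \\
&= \sum_{k=1}^{K} \mathrm{Re}\left( a_k^*\, \mathrm{sign}(a_k) \right) = \sum_{k=1}^{K} |a_k| \ge \|Z^*\|_{\mathcal{A}}.
\end{aligned}$$

Hence $\langle \mathcal{X}^*(q), Z^* \rangle_{\mathbb{R}} = \|Z^*\|_{\mathcal{A}}$. By strong duality, $Z^*$ is primal optimal and $q$ is dual optimal.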

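For uniqueness, suppose $\hat{Z}$ is another optimal solution. If $\hat{Z}$ and $Z^*$ have the same support set $T$, they must coincide since the set of atoms in $T$ is independent. Let $\hat{Z} = \sum_k \hat{a}_k c(\hat{\tau}_k) \hat{h}_k^H$ be its atomic decomposition where $\hat{a}_k > 0$, with some support $\hat{\tau}_k \notin T$. We then have

$$\begin{aligned}
\langle \mathcal{X}^*(q), \hat{Z} \rangle_{\mathbb{R}} &= \sum_{k} \mathrm{Re}\left( \hat{a}_k \langle Q(\hat{\tau}_k), \hat{h}_k \rangle \right) \\
&\le \sum_{\hat{\tau}_k \in T} \hat{a}_k \|Q(\hat{\tau}_k)\|_2 \|\hat{h}_k\|_2 + \sum_{\hat{\tau}_l \notin T} \hat{a}_l \|Q(\hat{\tau}_l)\|_2 \|\hat{h}_l\|_2 \\
&< \sum_{\hat{\tau}_k \in T} \hat{a}_k \|\hat{h}_k\|_2 + \sum_{\hat{\tau}_l \notin T} \hat{a}_l \|\hat{h}_l\|_2 = \|\hat{Z}\|_{\mathcal{A}},
\end{aligned}$$

which contradicts strong duality. Therefore the optimal solution of (9) is unique. ■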

Appendix B 

Proof of Proposition 2 


2M 

= M E Sn(j27r«nre^'2-"V„ 0 - Ji) 


a=-2M 


where = j^Sn{j‘27TKn)'^e^‘^^^'^Un 0 {bnb^ - II) ^ 


^2LKxL 


’s are independent random matrices with zero mean. 


Proof: We apply the non-commutative Bernstein's inequality [29] to (23). We have $\mathbb{E} S_n = 0$, and

Define 

Vm = 

_ 3(™) 

("t) 


||5'„|| = ^\Sn\ ■ - Il\\ 

= 

sup Re 

'u^ 

/3(m)(T)_3(m)(^)\ 


1 

kl|2 = l,|k||2 = l 


\ / 


^ T7naax|s„| • ||iz„||2 . max{||b„||2, ||J^| 

M n 

< — ■ (^K + K (27rnK)^^ max{pL, 1} 
UK^iL 


< 


M 


:= R. 


2M 


n=-2M 

1 

M2 


1 

M^ 


E 

2M 

E ® (bnb^ “ Il)] 

!M 

[(iznizf) (g) (6„6f - /l)]^ I 

E 4 lkn|| 2 («^n^'f) 

E[{b„b^ -lL){b„b^ -II)] 


n——2M 


2M 


a=-2M 


M2 


< 


UK^J^L 

M2 


2M 

E 4ll*^n||2(*^n^'f) «) E (||6„||2b„b^ - /l) 

^-2M 

2M 




2=-2M 


UKfxL 

< —— max s,i 


< 


M 

14:Kfj,L II- 


2M 


E Sn{l^nl^n)®lL 


t=-2M 


p|| < 20Ar/x£ _ ^2 


Proof: We first write 
3(m)(^)_gM(^) ^ ^ Wi™) 


2M 


2M 


sup Re ( 

lkll2 = l,lkll2 = l„^M ^ 




where in the first inequality we used that for two positive semidefinite matrices $A$ and $B$, $\|A - B\| \le \max\{\|A\|, \|B\|\}$, in the second inequality we used that $\max_n |s_n| \le 1$ and the incoherence property $\|b_n\|_2^2 \le \mu L$, and in the third inequality we used that $1 + \max_{|n| \le 2M} (2\pi n \kappa)^2 \le 14$ for $M \ge 4$ [3]. Moreover,


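We apply Talagrand's concentration inequality in Lemma 9 to bound $V_m$.

Lemma 9. [30, Talagrand's concentration inequality] Let $\{Y_j\}$ be a finite sequence of independent random variables taking values in a Banach space and $V$ be defined as $V = \sup_{h \in \mathcal{H}} \sum_j h(Y_j)$ for a countable family of real valued functions $\mathcal{H}$. Assume that $|h| \le B$ and $\mathbb{E} h(Y_j) = 0$ for all $h \in \mathcal{H}$ and every $j$. Then for all $t > 0$,

$$\mathbb{P}\{|V - \mathbb{E}V| > t\} \le 16 \exp\left( -\frac{t}{KB} \log\left( 1 + \frac{Bt}{\sigma^2 + B\,\mathbb{E}\bar{V}} \right) \right),$$

where $\sigma^2 = \sup_{h \in \mathcal{H}} \sum_j \mathbb{E} h^2(Y_j)$, $\bar{V} = \sup_{h \in \mathcal{H}} \left| \sum_j h(Y_j) \right|$, and $K$ is a numerical constant.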

Let = Re (u^wiT^v^, then E/r(wi'"^) = 0. 

We compute the following bounds: 


Re 


H 




< 4™+W^ := Bm, 

where we have used $\max_n |s_n| \le 1$, $|j2\pi\kappa n| \le 4$, and $\|\nu_n\|_2 \le \sqrt{14K}$ for $M \ge 4$ [3], and $\|b_n\|_2^2 \le \mu L$. Since


E 


/ 2M \ 


< E 

2M 




\t) 


_ 


H 


\t) 


M " " - M 
where we used $\|\nu_n\|_2^2 (\nu_n \nu_n^H) \preceq 14K (\nu_n \nu_n^H)$, $\|b_n\|_2^2\, b_n b_n^H \preceq \mu L\, b_n b_n^H$, and $\|\bar{\Gamma}\| = \|\Phi\| \le 1.3623$ from Lemma 1. The notation $A \preceq B$ indicates that $B - A$ is a positive semidefinite matrix. Applying Bernstein's inequality to (23) will then finish the proof. ■
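The final Bernstein step can be sketched numerically. The block below is a hedged illustration, assuming the summand bound $R = 14K\mu L/M$, the variance bound $\sigma^2 = 20K\mu L/M$, and a dimension factor of $4KL$ (the sum of the two dimensions of the $2KL \times 2KL$ summands); the function names and the example values of $K$, $L$, $\mu$, $\chi$, $\delta$ are hypothetical. Under these assumptions the tail bound is $4KL \exp\left(-\frac{\chi^2/2}{\sigma^2 + R\chi/3}\right)$, and the sample size in (24) makes it at most $\delta$ because $2(20 + 14\chi/3) < 80$ for $\chi < 0.6376$.

```python
import math

def bernstein_tail(M, K, L, mu, chi):
    """Matrix Bernstein tail bound for ||Gamma - Gamma_bar|| > chi (assumed constants)."""
    R = 14 * K * mu * L / M       # assumed bound on ||S_n||
    sigma2 = 20 * K * mu * L / M  # assumed bound on the variance term
    dim = 4 * K * L               # assumed dimension factor
    return dim * math.exp(-(chi**2 / 2) / (sigma2 + R * chi / 3))

def lemma2_sample_size(K, L, mu, chi, delta):
    """Sample size from (24): 80 mu K L / chi^2 * log(4KL / delta)."""
    return 80 * mu * K * L / chi**2 * math.log(4 * K * L / delta)

# Hypothetical example values.
K, L, mu, chi, delta = 3, 4, 1.5, 0.25, 0.05
M = lemma2_sample_size(K, L, mu, chi, delta)
tail = bernstein_tail(M, K, L, mu, chi)
print(M, tail)
```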

Appendix C 
Proof of Lemma 4 


= ETr y] Wi™) E 

\n^-2M / \n'^-2M 

( 2M \ 

y] wtHwtY] 

n^-2M / 


= E |sniy27rKn)2”"Tr(iz„iyy) • Tr(E(b„bf - II)^) 


2M 


M2 


n=-2M 
|2m / 2M 

M^ 

/\2m 




=-2M 


n=-2M 


- M ^ - M ’ 

where in the first inequality we used $\|A\| \le \|A\|_F$, followed by exchanging the order of taking expectation and the trace and using the independence of the $W_n^{(m)}$'s. In the second inequality, we again used $\max_n |s_n| \le 1$, $|j2\pi\kappa n| \le 4$, and $\|\nu_n\|_2 \le \sqrt{14K}$ for $M \ge 4$ [3], and $\|b_n\|_2^2 \le \mu L$. The last inequality follows from $\mathrm{Tr}(\Phi) = 2K$. This yields


Conditioned on n S 2 , we have 


= ¥.Vm < (EC, 


2\1/2 


< 4™L 


2nK 

M 


Next, we have 

< E 


u 








H 


E((6„b^ - lL)vv^{bnb^ - II))] u 


< ^|s„P(27rKn)^™M^ [uni^^ ® I^LIl] 

H 


u 


- [snl^nK ® Il\ U 

where in the first inequality we used 

E((h„6^ - lL)vv^{b^b^ - II)) 

= nKv\^Kb’^)-vv^ 

< WbnWllh - VV^ ^ ^J,LIL■ 

Therefore, 

2M 


n=-2M 


= ^ Eh\wi^^) 

2M 

E 


M2 




H^I 


.ra=-2M 


U 


Am) 


(t) 


< 




(m) 


11 sign (a*) (g) h* 


< 4 :VK\m,, 


where we used $\|L_k\| \le \|L\| \le 2\|\bar{\Gamma}^{-1}\| \le 4$ from Lemma 3. Set $\lambda_m \le \epsilon/(4\sqrt{K})$. Applying the union bound and getting rid of the conditional probability we have


sup 


V TdeXgrid 



(t) ^ > e 1 < 64|Tgrid|e 


+ P(^f.x). 

(33) 

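where the second term is at most $\delta$ when $M \ge \frac{80\mu K L}{\chi^2} \log\left(\frac{4KL}{\delta}\right)$. Set $a^2 = c^{-1} \log\left(\frac{64|T_{\mathrm{grid}}|}{\delta}\right)$ so that the first term in (33) is also bounded by $\delta$.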

r— / \ 1/4 

If 2KL\ >1, from Lemma 4 we have a < I 1 


which gives M > 

< 2^^_L‘ /2Mf 


64 


\8^^,L J 

fxL \og^ Moreover, since 


> 


22"*+il 


^ Then 

2fiK- 


< ^42"*||r|i < 

- M " " - M 

where in the last inequality we used ||r|| < 2. Applying 
Lemma 9 we have 

P{|||H(™)(T)-H('")(r) -E > f} 

< 16 exp (- -4- log (^1 + ^ 


M ’ we have ^ 

(33) holds as long as 22^+1 l \J 2 ^ — which gives 

M > 42 ™+ 7 Mf^. If 2KL\^ < 1, from Lemma 4 we 

have a < which gives M > ^^KL\og 

The rest follows similarly as in the previous case. Setting $\chi = 1/4$, the proof is complete by combining all lower bounds on $M$ and absorbing the constants. ■

Appendix E 
Proof of Lemma 6 

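Proof: To bound $I_2^{(m)}(\tau)$, we recall from [16] that $\|\zeta^{(m)}(\tau)\|_2 \le C_1$ for some universal constant $C_1$, hence $\|\bar{\Xi}^{(m)}(\tau)\| = \|\zeta^{(m)}(\tau) \otimes I_L\| \le C_1$. Moreover, conditioned on $\mathcal{E}_{1,\chi}$ with $\chi \in (0, 1/4]$,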


\r) 


< 


■>(m) 


(t-) 


\ CBm V 'T’m P BmEVm 

for some constant C. Suppose 

CTm = max(CT^,BmEC„) = 2'‘’”+i^max|l,2A:Ly^| , 
and fix a obeying a < OmiBm, we have 


\L - L® Jl|| • ||sign(a*) (g) h *\\2 
. Plug this into (24), 


C'Va 


>4'"LV^+aam| < 166"“' 

for some constant c > 0, which finishes the proof. ■ 

Appendix D 
Proof of Lemma 5 

Proof: Define the event 


< c'/ax, 

To bound this by e, set x = 
we require M > C/tiT^L log (4^P) for some large enough 
constant C. ■ 

Appendix F 
Proof of Lemma 7 

Proof: First of all, we have 

K’”|Q4^(r)| < CViM|lL||||sign(a*)Oh*|| 

< cVk^Vk = ckVl, 

then for any fixed $\tau_1$, $\tau_2$, by Bernstein's polynomial inequality, we have


£2 = 




^dSTgrid 




< |g-i27rTi _ g-f2'n-T2| 


< 2^™L 


2m T , / 


M 


ttfJm ^— A, 




sup 


dz 


< 47r|Ti — T 2 I 2 M sup 
< CMKL,

for some constant C. Therefore, we have 


+ K^Re ((/r*g"(T))^(Q(T) - h*Q{T))) 

< K^Re ((g"(T))*Q(T)) + e(l + e) + 1.7068e, 




1/2 


< CMKL^-^ < CM*, 


we can select the grid size such that for any t G [ 0 , 1 ], there 
exists a point Td G Tg^jd satisfying |r —r^l < 35 ^- The grid 
size can be 3CM*/e. With this selection, we have 


<6, VrCp,!], 


as long as M satisfies the requirement in (31). 


Appendix G 
Proof of Lemma 8 

We first record some useful properties of $\bar{Q}(\tau)$ below, most of which are borrowed from [16, Proposition IV.2].

Lemma 10. Assume $\Delta \ge 1/M$. Then, for $\tau \in T_{\mathrm{far}}$, $|\bar{Q}(\tau)| \le 0.99992$, and for $\tau \in T_{\mathrm{near}}$,

$$\kappa^2 \left( \bar{Q}_R(\tau) \bar{Q}_R''(\tau) + |\bar{Q}'(\tau)|^2 + |\bar{Q}_I(\tau)|\, |\bar{Q}_I''(\tau)| \right) \le -0.07865,$$

$\kappa^2 |\bar{Q}''(\tau)| \le 1.7068$, and $\kappa |\bar{Q}'(\tau)| \le 0.4346$.

Note that the bound $\kappa^2 |\bar{Q}''(\tau)| \le 1.7068$ in Lemma 10 is new; it follows directly from rewriting [3, equation (2.27)] to bound $\kappa^2 |\bar{Q}''(\tau)|$, hence we omit the details.

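Proof: First assume $\tau \in T_{\mathrm{far}}$. Using Lemma 7 we have

$$\begin{aligned}
\|Q(\tau)\|_2 &\le \|h^* \bar{Q}(\tau)\|_2 + \|Q(\tau) - h^* \bar{Q}(\tau)\|_2 \\
&\le |\bar{Q}(\tau)| + \epsilon \le 0.99992 + \epsilon < 1,
\end{aligned}$$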

as long as e < 10 “^. 

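Next assume $\tau \in T_{\mathrm{near}}$. Since our choice of the coefficients implies that

$$\frac{d\|Q(\tau)\|_2^2}{d\tau}\bigg|_{\tau = \tau_i} = \mathrm{Re}\left( 2 Q(\tau)^H \frac{dQ(\tau)}{d\tau} \right)\bigg|_{\tau = \tau_i} = \left( 2 Q_R(\tau)^T Q_R'(\tau) + 2 Q_I(\tau)^T Q_I'(\tau) \right)\Big|_{\tau = \tau_i} = 0,$$

it is sufficient to establish $\frac{d^2 \|Q(\tau)\|_2^2}{d\tau^2} < 0$ for $\tau \in T_{\mathrm{near}}$. First of all,

$$\begin{aligned}
\frac{1}{2} \frac{d^2 \|Q(\tau)\|_2^2}{d\tau^2} &= Q_R(\tau)^T Q_R''(\tau) + \|Q_R'(\tau)\|_2^2 + Q_I(\tau)^T Q_I''(\tau) + \|Q_I'(\tau)\|_2^2 \\
&= \mathrm{Re}\left( (Q''(\tau))^H Q(\tau) \right) + \|Q'(\tau)\|_2^2.
\end{aligned}$$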


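Since

$$\begin{aligned}
\|\kappa Q'(\tau)\|_2^2 &= \|\kappa Q'(\tau) - \kappa h^* \bar{Q}'(\tau) + \kappa h^* \bar{Q}'(\tau)\|_2^2 \\
&\le \epsilon^2 + \kappa^2 |\bar{Q}'(\tau)|^2 + 0.8692\epsilon
\end{aligned}$$

and

$$\begin{aligned}
\kappa^2 \mathrm{Re}\left( (Q''(\tau))^H Q(\tau) \right) &= \kappa^2 \mathrm{Re}\left( (\bar{Q}''(\tau))^* \bar{Q}(\tau) \right) + \kappa^2 \mathrm{Re}\left( (Q''(\tau) - h^* \bar{Q}''(\tau))^H Q(\tau) \right) \\
&\quad + \kappa^2 \mathrm{Re}\left( (h^* \bar{Q}''(\tau))^H (Q(\tau) - h^* \bar{Q}(\tau)) \right) \\
&\le \kappa^2 \mathrm{Re}\left( (\bar{Q}''(\tau))^* \bar{Q}(\tau) \right) + \epsilon(1 + \epsilon) + 1.7068\epsilon,
\end{aligned}$$

therefore,

$$\begin{aligned}
\frac{\kappa^2}{2} \frac{d^2 \|Q(\tau)\|_2^2}{d\tau^2} &\le \kappa^2 \mathrm{Re}\left( (\bar{Q}''(\tau))^* \bar{Q}(\tau) \right) + \epsilon(1 + \epsilon) + 1.7068\epsilon + \epsilon^2 + \kappa^2 |\bar{Q}'(\tau)|^2 + 0.8692\epsilon \\
&\le \kappa^2 \left( \bar{Q}_R(\tau) \bar{Q}_R''(\tau) + |\bar{Q}'(\tau)|^2 + |\bar{Q}_I(\tau)|\, |\bar{Q}_I''(\tau)| \right) + 2\epsilon^2 + 3.576\epsilon \\
&\le -7.865 \times 10^{-2} + 2\epsilon^2 + 3.576\epsilon < 0
\end{aligned}$$

for a small enough numerical constant $\epsilon$. Plugging the choice of $\epsilon$ into the sample complexity requirement of Lemma 7, the proof is completed by keeping only the dominating terms.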
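How small $\epsilon$ must be can be read off from the two conditions used in this proof: $0.99992 + \epsilon < 1$ on $T_{\mathrm{far}}$ and $-7.865 \times 10^{-2} + 2\epsilon^2 + 3.576\epsilon < 0$ on $T_{\mathrm{near}}$. The following is a small sketch that solves both for the largest admissible $\epsilon$; the function and variable names are illustrative, and only the numerical constants come from the text.

```python
import math

def near_region_slack(eps):
    """Upper bound on the scaled second derivative in the near region."""
    return -7.865e-2 + 2 * eps**2 + 3.576 * eps

def max_eps_near():
    """Positive root of 2 eps^2 + 3.576 eps - 0.07865 = 0."""
    a, b, c = 2.0, 3.576, -7.865e-2
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

eps_near = max_eps_near()     # near-region constraint
eps_far = 1 - 0.99992         # far-region constraint: eps < 8e-5
eps_max = min(eps_near, eps_far)
print(eps_near, eps_far, eps_max)
```

Under these constants the far-region condition is the binding one, so any fixed $\epsilon$ below $8 \times 10^{-5}$ satisfies both requirements.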